Text File | 1996-07-21 | 39.5 KB | 889 lines | [TEXT/R*ch] |
- Update v2.0.1: FileFlex International WorldFlex Functions
-
- * Understanding Character-Level Sort Order
- * Custom Character Sort Orders
- * Creating a Single-Byte Custom Sort Order Table
- * Creating Single-Byte Sort Order Utility Scripts
- * Understanding Double-Byte Sort Order Tables
- * Creating Double-Byte Sort Order Tables
- * Tricks with Sort Order
- * Setting the Sort Order with FileFlex
-
- * Character Translation
- * Creating Character Translation Utility Scripts
- * Translating Characters Using FileFlex
-
- * Case Translation
- * Creating Case Translation Utility Scripts
- * Intelligent Case Conversion Using FileFlex
- * Standalone Intelligent Case Conversion Function
-
- FileFlex is used within multimedia productions throughout the world. While
- standard ASCII is prevalent, it is certainly not ubiquitous. When dealing
- with international languages, it's necessary to account for differences in
- character sorting order, for differences in case conversion, for differences
- in character values, and for double-byte characters.
-
- FileFlex' new WorldFlex technology now gives you the ability to build
- international flexibility into your applications with unprecedented power.
- FileFlex' WorldFlex technology gives you true dynamic localization. Unlike
- virtually all other so-called "world-aware" implementations, you're not
- forced to rely on a particular operating system revision or a
- country-nationalized version of an application. FileFlex allows you to
- define your own international conversion tables and apply them on-the-fly to
- any data management task. This dynamic localization functionality allows you
- to switch languages, character sets, sort orders, and conversions at any
- time throughout the operation of your multimedia production instantly, with
- virtually no impact upon FileFlex' already blazing performance.
-
- FileFlex' WorldFlex technology falls into three broad categories:
-
- * Dynamic character-level sort order: FileFlex allows you to use indexes
- and queries that dynamically switch between sort-order tables. Finally,
- an accented "a" character is treated like a regular "a", rather than
- something from Mars. Sort orders can be specified for either
- single-byte or double-byte languages.
-
- * Character translation: As many FileFlex users have discovered, the
- special diacritical characters have different values between Macintosh
- and Windows, and even between DOS and Windows. FileFlex allows you to
- convert characters so that all the diacritical marks (and any other
- conversions you may need) are all in the right places and your
- characters look just right.
-
- * Case conversion: Normal case conversion routines apply a simple
- heuristic to determine the upper case value of a character. Converting
- an "a" to an "A" is simply a matter of subtracting 32. But what about
- converting a "u" with an umlaut to an upper case value? What about
- converting vowels with accents to their equivalent upper case
- characters? FileFlex provides two standalone functions that allow you
- to use custom case conversion tables so that your case conversions make
- sense in your language. FileFlex' internal index and query functions
- also take into account custom case conversion tables so your data can
- be case insensitive when desired (as opposed to case insane).
-
- Before we proceed with details of these functions, we'd like to thank our
- customers throughout the world for working with us to understand the
- individual needs of different languages and customs and how those needs
- apply to the authoring of multimedia productions worldwide.
-
- Understanding Character-Level Sort Order
-
- ----------------------------------------------------------------------------
- Note: The character-level sorting features in FileFlex require that you have
- a measurable amount of programming expertise. These features let you modify
- the very core of FileFlex data management and require both care to use and
- experience to understand. If you're not a pretty advanced scripter or
- programmer, you may want to find an experienced "buddy" to team up with
- before attempting to utilize these powerful capabilities.
- ----------------------------------------------------------------------------
- FileFlex uses index files to sort information. When you create an index
- file, you're choosing a field that will determine the sort order of the
- database. For example, you might choose to sort on zipcode (a numeric code
- in the US that helps the post office tell where to deliver mail--in other
- countries this is often called the postal code), meaning that records
- containing 08553 in the zipcode field will be earlier in the database than
- records with 94404 in the zipcode field. Likewise, if you chose to organize
- your data based on last name, then "Clinton" would come before "Kennedy".
-
- When you switch indexes, FileFlex doesn't reorder the entire database of
- records. Instead it adopts a different sort order based on the data in the
- fields. FileFlex creates the order of information in an index file when
- DBCreateIndex is called. It maintains and updates that order of information
- as part of the process of writing a record.
-
- When FileFlex updates an index file, it's comparing the values in two
- different records. When it looks at "Clinton" and "Kennedy", it looks at the
- first characters (i.e., "C" and "K") and determines that "C" comes before
- "K" and therefore "Clinton" comes before "Kennedy".
-
- This comparison of "C" vs. "K" is based on the standard ordered table we
- call ASCII (American Standard Code for Information Interchange). When
- FileFlex compares "C" against "K", it's really getting the ASCII value of
- "C" (67 decimal) and comparing it to the ASCII value of "K" (75 decimal).
- Since 67 comes before 75, then "C" comes before "K".
-
- Note: Character sorting is case sensitive. A lower case "c" is ASCII 99
- while an upper case "C" is ASCII 67. If you were to compare "clinton"
- (note the lower case "c") against "Kennedy", "Kennedy" would come first
- because the ASCII value of "K" (ASCII 75) is less than that of lower case
- "c".
-
- So, when FileFlex looks at "CLINTON" and "KENNEDY", it's really looking at
- the comparative weights (or priorities) of the individual characters,
- according to their representation in ASCII. Here are the two strings and
- their corresponding values:
-
-       C    L    I    N    T    O    N
-      67   76   73   78   84   79   78
-       |    |    |    |    |    |    |
-      75   69   78   78   69   68   89
-       K    E    N    N    E    D    Y
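-
- The character-by-character comparison described above can be sketched as
- follows. This is only an illustration, written in Python rather than Lingo;
- ascii_compare is a hypothetical helper name, not a FileFlex function, and the
- rule that a shorter string sorts first is an assumption.

```python
def ascii_compare(a, b):
    """Compare two strings by the ASCII values of their characters.

    Returns -1 if a sorts first, 1 if b sorts first, 0 if they are equal.
    """
    for ca, cb in zip(a, b):
        if ord(ca) != ord(cb):
            return -1 if ord(ca) < ord(cb) else 1
    # Assumption: when all shared positions match, the shorter string wins.
    return (len(a) > len(b)) - (len(a) < len(b))


print([ord(c) for c in "CLINTON"])           # [67, 76, 73, 78, 84, 79, 78]
print(ascii_compare("Clinton", "Kennedy"))   # -1: "C" (67) is less than "K" (75)
print(ascii_compare("clinton", "Kennedy"))   # 1: "c" (99) is greater than "K" (75)
```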
-
- Custom Character Sort Orders
-
- FileFlex' new WorldFlex technology allows you to customize the
- character-level sort order used by the FileFlex indexing routines. There are
- two primary reasons you might want to do this:
-
- * To sort in descending rather than ascending order
-
- * To sort according to sorting rules different than ASCII, in particular
- for languages other than English.
-
- In fact, a very important part of WorldFlex technology is the ability to
- change the sort order of your characters, and thereby sort your database
- according to the sorting rules you feel are currently appropriate.
-
- Many so-called "internationalized", "localized", or "world-aware" systems do
- provide support for character sorting order for multi-country use. But they
- are usually available only when you're running the localized version of the
- operating system or database application. While many of our friends outside
- the US are grateful for any mechanism that recognizes their native language,
- FileFlex doesn't stop there. FileFlex' new WorldFlex technology is vastly
- more powerful. FileFlex allows you to change your sorting order on-the-fly,
- as you switch index files. Nothing else can do this!
-
- Here's an example of where this is so powerful: Imagine you're a
- multi-national firm with customers throughout the world. When you do a query
- to list your customers in the US, the ASCII sort order is just fine. But
- when you do a query to list customers in Japan, you want the customers'
- names sorted by the appropriate sorting conventions for the Japanese
- language and character sets--not according to the rather provincial
- expectations of ASCII. With FileFlex, you can switch from an ASCII index to
- an index ordered according to Japanese sort order absolutely instantly.
-
- Creating a Single-Byte Custom Sort Order Table
-
- Character sort orders are controlled by a custom sort order table. For
- applications and languages that use single-byte characters (typically,
- "roman" languages), each character can be represented by a single byte.
- Since a byte is 8 bits wide, this allows for 256 characters.
-
- You create a sort order table in your host development environment's
- programming language (our examples will be in Director's Lingo). We do this
- by building a table containing three bytes of data for each character in the
- sort order:
-
- * Leader Flag Byte: For single-byte languages, this byte is always set to
- 255.
- * Priority Multiplier Byte: For single-byte languages, this byte is also
- always set to 255.
- * Priority Value Byte: This value signifies the priority of the character
- in the list (never use 0).
-
- At the end of all of the three-byte sets, a single byte containing the value
- 0 is used to terminate the table.
-
- Before we look in more detail at the Priority Value Byte, let's first look
- at how ASCII prioritizes its characters:
-
-      A   B   C   D   E  ...   V   W   X   Y   Z
-     65  66  67  68  69  ...  86  87  88  89  90
-
- Since "A" is an ASCII 65, it's got a lower value than "D", which is an ASCII
- 68. The numbers 65 and 68 correspond to the priority value of the various
- letters. Likewise, in FileFlex' custom sort order tables, the lower the
- priority value, the earlier in the sort the character will be placed. If we
- wanted to sort in reverse order ("Z" before "A"), we could assign different
- priority values, giving "Z" a much lower number than "A", as in the
- following list:
-
-      Z   Y   X   W   V  ...   E   D   C   B   A
-     65  66  67  68  69  ...  86  87  88  89  90
-
- With the priorities shown above, if we looked up a "D", we'd see its value
- was 87. Since an "A" has a priority value of 90, the "D" would come earlier
- in the list. If we used this set of priority values, "KENNEDY" would
- certainly appear before "CLINTON".
-
- It's important to remember that the priority value is entirely up to you. If
- you wanted all words beginning with vowels (A, E, I, O, and U) to come at the
- beginning of the list, you might create the following table of priority
- values:
-
-      A   E   I   O   U   B   C   D   F   G   H   J   K   L
-     65  66  67  68  69  70  71  72  73  74  75  76  77  78  ...
-
- FileFlex determines where in the sort order table to find a priority value
- based on the character's actual computer-code value (usually ASCII). So,
- since "A" has the ASCII code value of 65, FileFlex will look at entry
- position 65 in the sort order table (counting the first entry as position 0)
- to retrieve the priority value. Let's make this a bit clearer by
- constructing a partial sort order table for traditional ASCII (note, we're
- showing all three data bytes as described above and all numbers are in
- base-10):
-
-     Entry Pos       65            66            67            68
-     US Char         "A"           "B"           "C"           "D"
-     Data Bytes  255 255 065   255 255 066   255 255 067   255 255 068
-
-     Entry Pos       69            70            71            72
-     US Char         "E"           "F"           "G"           "H"
-     Data Bytes  255 255 069   255 255 070   255 255 071   255 255 072
-
-     Entry Pos       73            74            75            76
-     US Char         "I"           "J"           "K"           "L"
-     Data Bytes  255 255 073   255 255 074   255 255 075   255 255 076
-
- So, to create a FileFlex sort order table that matches traditional ASCII in
- ascending order, you'd want "A" to have a sort order priority of 65, so the
- third data byte at position 65 would be the value 65.
-
- Now let's look at how the table would change if we wanted to sort everything
- in reverse order (note that we've reversed the entire ASCII character set):
-
-     Entry Pos       65            66            67            68
-     US Char         "A"           "B"           "C"           "D"
-     Data Bytes  255 255 190   255 255 189   255 255 188   255 255 187
-
-     Entry Pos       69            70            71            72
-     US Char         "E"           "F"           "G"           "H"
-     Data Bytes  255 255 186   255 255 185   255 255 184   255 255 183
-
-     Entry Pos       73            74            75            76
-     US Char         "I"           "J"           "K"           "L"
-     Data Bytes  255 255 182   255 255 181   255 255 180   255 255 179
-
- Using the above table, when FileFlex encounters the character "A", which has
- the ASCII value of 65, it looks at entry position 65 in the table. It then
- retrieves the priority value, which is 190. If FileFlex then looks for "C"
- (at entry position 67 in the table), it retrieves the priority value of 188.
- Since 188 is less than 190, FileFlex will put "C" before "A".
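-
- The table layout and lookup just described can be modelled in a few lines.
- This is a sketch in Python rather than Lingo, under the assumption that the
- three bytes for entry N start at byte offset N * 3; build_sort_order and
- priority are hypothetical helper names, not FileFlex calls. The priority is
- clamped to 1 so that 0 is never stored as a priority value.

```python
def build_sort_order(priority_for):
    """Build a single-byte sort order table as a bytes object.

    priority_for maps a character code (1-255) to its priority value.
    """
    table = bytearray()
    for code in range(256):
        table.append(255)  # leader flag: 255 for single-byte tables
        table.append(255)  # priority multiplier: 255 for single-byte tables
        # Entry 0 gets 255; other entries are clamped so 0 is never used.
        table.append(255 if code == 0 else max(1, priority_for(code)))
    table.append(0)        # terminator byte
    return bytes(table)


def priority(table, char):
    """Return the priority value byte stored for a single character."""
    return table[ord(char) * 3 + 2]


ascii_reverse = build_sort_order(lambda code: 255 - code)

print(priority(ascii_reverse, "A"))  # 190
print(priority(ascii_reverse, "C"))  # 188
print(len(ascii_reverse))            # 769: 256 three-byte entries plus terminator

names = sorted(["CLINTON", "KENNEDY"],
               key=lambda s: [priority(ascii_reverse, c) for c in s])
print(names)                         # ['KENNEDY', 'CLINTON']
```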
-
- Creating Single-Byte Sort Order Utility Scripts
-
- The best way to create the sort order table is to write a simple utility
- script. Here's an example script that builds a sort order table matching
- standard ASCII order:
-
-     on buildSortOrder_ASCII
-       global ASCII
-       put "" into theTable
-       repeat with i = 0 to 255
-         put the number of chars of theTable into theChar
-         put numToChar(255) after theTable -- leader flag (255 = no leader char)
-         put numToChar(255) after theTable -- priority multiplier (255 = none)
-         if i = 0 then
-           put numToChar(255) after theTable -- use 255 in byte 0
-         else
-           put numToChar(i) after theTable -- priority value
-         end if
-       end repeat
-       put numToChar(0) after theTable -- terminator byte code
-       put theTable into ASCII
-     end buildSortOrder_ASCII
-
- Note the name of the handler is "BuildSortOrder_ASCII". We've developed a
- convention where the routine that builds the sort order is called
- "BuildSortOrder_" and the name of the sort order itself is appended to the
- end. The sort order table is placed in a global variable of the same name.
- So, for a sort order for French Canadian, we recommend naming the handler
- "BuildSortOrder_FrenchCanadian" and the global variable containing the sort
- order "FrenchCanadian".
-
- Note that the routine above places the actual byte value into the string by
- using numToChar(x). This places a single byte value corresponding to the
- number in the string location. Each set of data bytes in the table gets two
- bytes with 255 (the leader flag and the priority multiplier), followed by
- the byte corresponding to the priority value. Finally, after all the data
- byte sets are added to the string, BuildSortOrder_ASCII appends a terminator
- byte (value 0).
-
- Here's an example routine that reverses the ASCII sort order, placing the
- table in the global ASCIIReverse:
-
-     on buildSortOrder_ASCIIReverse
-       global ASCIIReverse
-       put "" into theTable
-       put 255 into priority
-       repeat with i = 0 to 255
-         put the number of chars of theTable into theChar
-         put numToChar(255) after theTable -- leader flag (255 = no leader char)
-         put numToChar(255) after theTable -- priority multiplier (255 = none)
-         if i = 0 then
-           put numToChar(255) after theTable -- use 255 in byte 0
-         else
-           put numToChar(priority) after theTable -- priority value
-         end if
-         put priority-1 into priority
-       end repeat
-       put numToChar(0) after theTable -- terminator byte code
-       put theTable into ASCIIReverse
-     end buildSortOrder_ASCIIReverse
-
- ----------------------------------------------------------------------------
- WARNING: Make absolutely certain you end each sequence with a numToChar(0)
- terminator byte. Failure to do this could cause FileFlex to scan beyond the
- end of the sort order table, with unpredictable results, and your program
- could terminate abnormally.
- ----------------------------------------------------------------------------
-
- Understanding Double-Byte Sort Order Tables
-
- If the language you're sorting uses double-byte characters (like certain
- Japanese and Chinese character sets), you'll need to create double-byte sort
- order tables. Double-byte character sets are different because they use two
- bytes for many characters. The computer distinguishes between a standard
- single-byte character and a double-byte character by the existence of a leader
- byte. This leader byte tells the computer that the byte that follows the
- leader byte is to be treated as a special character, rather than simply part
- of the standard ASCII table.
-
- FileFlex sort order tables are not limited to 256 entries. Instead, they can
- hold anywhere from 256 entries to 65,280 entries (255 * 256), with each
- entry taking three bytes. Each set of 256 entries in the sort order table is
- called a "sort order page" and the maximum number of sort order pages
- allowed by FileFlex is 255.
-
- If you recall from earlier, each character value is represented in the sort
- order table by three bytes, a leader char byte, a priority multiplier byte,
- and a priority value byte. Also, if you recall, the leader char byte for
- single-byte sort order tables was always 255. That told FileFlex to look in
- the very first page of the sort order table (i.e., the very first set of 256
- entries) for the character's priority value.
-
- When you're using double-byte character sets, you'll need more than one
- 256-byte page to represent the sort order. The value that's placed in the
- leader character tells FileFlex in which sort order page to look for the
- priority value of the character which follows the leader character. Let's
- diagram that out:
-
- Suppose that your language character set uses characters with the value of
- 128 as a leader character. Now, let's suppose your database has a
- double-byte character with the values 128 and 065 respectively for the two
- bytes. Here's how the sort order table might be defined:
-
- Sort order page 0
- ---------------------------
- Position #128: 001 255 255
-
- Sort order page 1
- ---------------------------
- Position #65: 255 255 015
-
- When reading the character stream, FileFlex would read the first byte and
- determine its value was 128. It would then go to position 128 in the sort
- order table and read the first byte of that entry. Since the first byte (the
- leader byte flag) is not a 255, it would know that 128 was a leader byte.
- Since the leader byte flag is 1, FileFlex would know that the next character
- retrieved should be compared against sort order page 1 (located in the
- second bank of 256 entries).
-
- FileFlex would now read the second byte of the character. Since it knows
- that this byte is the second of a double-byte character, FileFlex will then
- determine the byte's value (in this case 65) and jump 65 entries into the
- second sort order page (or to entry 321...256+65...of the full sort order
- table). At entry 321 (position 65 in the second page), FileFlex would look
- at the priority value byte and determine that the priority of the character
- represented by 128 065 is 15.
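-
- The two-step lookup above can be traced with a small model. This is a sketch
- in Python, not FileFlex's actual implementation: it assumes entries are
- stored as consecutive three-byte triplets (so page P, position N starts at
- byte (P * 256 + N) * 3), it ignores the priority multiplier, and the helper
- names make_page, set_entry, and next_priority are hypothetical.

```python
ENTRY_SIZE = 3      # leader flag, priority multiplier, priority value
PAGE_ENTRIES = 256  # entries per sort order page


def make_page():
    """One page of 256 entries, each initialised to 255 255 255."""
    return bytearray([255] * (PAGE_ENTRIES * ENTRY_SIZE))


def set_entry(table, page, position, triplet):
    """Store a three-byte entry at the given page and position."""
    offset = (page * PAGE_ENTRIES + position) * ENTRY_SIZE
    table[offset:offset + ENTRY_SIZE] = bytes(triplet)


def next_priority(table, data, i):
    """Return (priority value, bytes consumed) for the character at data[i]."""
    offset = data[i] * ENTRY_SIZE
    consumed = 1
    leader_flag = table[offset]
    if leader_flag != 255:
        # data[i] is a leader byte; its flag names the page for the next byte.
        offset = (leader_flag * PAGE_ENTRIES + data[i + 1]) * ENTRY_SIZE
        consumed = 2
    return table[offset + 2], consumed


table = make_page() + make_page()         # page 0 and page 1
set_entry(table, 0, 128, (1, 255, 255))   # 128 is a leader byte for page 1
set_entry(table, 1, 65, (255, 255, 15))   # character 128 065 has priority 15
table.append(0)                           # terminator byte

print(next_priority(table, bytes([128, 65]), 0))  # (15, 2)
```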
-
- Creating Double-Byte Sort Order Tables
-
- You create a double-byte sort order table very much like you would a
- single-byte table. You create sets of three-byte sequences for each
- character. For each sort order page, you create 256 of these three-byte
- sets. At the very end, you place a single byte with the value 0 that
- signifies the termination of the table.
-
- You should probably lay out the sort order tables on paper before you
- attempt to write the code to generate a table.
-
- First, you should determine those byte values that are leader bytes. For
- every unique leader byte value, assign a sort order page, from page 1 to
- 254. Obviously, you want to keep the number of sort order pages down as much
- as possible to make things run faster and to use less memory. In the triplet
- for each leader byte, place the assigned page number in the first byte and
- make sure you've set the following two bytes to 255.
-
- Next, fill in all the other remaining values in the first 256-entry page. For
- each character, assign a weighted value and place that in the third byte of
- the data triplet.
-
- Note: You can use the second byte of the data triplet as a priority
- multiplier. If you need priorities higher than 255, use the priority
- multiplier byte by setting it to anything from 1 (earliest in the priority
- order) to 254 (last in the priority order).
-
- After you've filled in the first sort order page, you can then create the
- subsequent pages. In these pages, the first byte of the triplet will always
- be 255, the second byte between 1 and 254 depending on your desired priority
- multiplier, and the third value byte also between 1 and 254.
-
- Finally, append a terminator byte--which needs to be a numToChar(0) value.
-
- Once you've laid all this out on paper, you can write a BuildSortOrder_
- routine that will create a global variable containing your sort order.
-
- Tricks with Sort Order
-
- You can do some pretty interesting things with sort orders besides handling
- international issues. For example, let's assume you wanted to sort numerical
- data which you stored in a character field.
-
- Note: You should generally do this because the DBF format stores numbers as
- ASCII values internally. But if you use character fields to store numbers,
- you get to manipulate values with more control (i.e., sort order).
-
- So, again, let's assume you've got a character field containing numeric
- data. Sometimes, in a numeric field, you might want to have spaces or
- asterisks instead of zeros, like in the following example:
-
- "0002598"
- " 2598"
- "***2598"
-
- When creating a custom sort order table for numerical sorts in character
- fields, you can give the space character (ASCII 32), the asterisk character
- (ASCII 42), and the zero all the same priority value weighting. This would
- cause the sorting/seeking routines to treat all three characters the same.
-
- This kind of "equalizing" of sorting values also applies to those special
- international characters, like letters with umlauts (e.g., the double-dots)
- or accent marks over characters. You might want to treat a lower case 'a'
- and a lower-case 'a' with an accent mark as the same character in sort
- order.
-
- You can also do this with upper and lower case values. If you want upper
- case and lower case letters to be sorted together, give them the same
- priority value.
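-
- The "equalizing" trick can be sketched as follows. This is an illustration
- in Python rather than Lingo; build_equalized_table and sort_key are
- hypothetical helper names, and the table starts from plain ASCII order before
- the shared priorities are applied.

```python
def build_equalized_table():
    """Sort order table where some characters share a priority value."""
    priorities = [255] + list(range(1, 256))  # ASCII order; entry 0 gets 255
    # Space and asterisk sort exactly like the digit zero.
    for ch in " *":
        priorities[ord(ch)] = priorities[ord("0")]
    # Lower case letters share the priority of their upper case forms.
    for code in range(ord("a"), ord("z") + 1):
        priorities[code] = priorities[code - 32]
    table = bytearray()
    for p in priorities:
        table += bytes([255, 255, p])  # leader flag, multiplier, priority
    table.append(0)                    # terminator byte
    return bytes(table)


def sort_key(table, text):
    """List of priority values for each character in text."""
    return [table[ord(c) * 3 + 2] for c in text]


equalized = build_equalized_table()
print(sort_key(equalized, "0002598") == sort_key(equalized, "   2598"))  # True
print(sort_key(equalized, "0002598") == sort_key(equalized, "***2598"))  # True
print(sort_key(equalized, "clinton") == sort_key(equalized, "CLINTON"))  # True
```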
-
- Setting the Sort Order with FileFlex
-
- You can tell FileFlex to use a new sort order with the FileFlex command
- DBSetSortOrder. Unlike most FileFlex commands, DBSetSortOrder is a wrapper
- script that does not call FileFlex directly. Instead, DBSetSortOrder sets
- two FileFlex global properties: gDBWorldSort and gDBSortOrder.
-
- Note: I almost named the gDBSortOrder variable gDBWorldOrder. Then the
- function would have been DBSetWorldOrder. But that seemed far too
- Republican, so I restrained myself. Wouldn't it be great if you could write
- a new translation table, give a quick call to DBSetWorldOrder, and--poof--a
- new world order emerges? It gives new (and terrifying) meaning to the phrase
- "FileFlex users rule!" [chuckle] [[shiver]].
-
- Here's the Lingo code for DBSetSortOrder:
-
-     on DBSetSortOrder order
-       global gDBWorldSort
-       global gDBSortOrder
-       if order = EMPTY then
-         put EMPTY into gDBWorldSort
-       else
-         put "1" into gDBWorldSort
-         put order into gDBSortOrder
-       end if
-       return 0
-     end DBSetSortOrder
-
- When you call DBSetSortOrder, you want to pass your sort order table. Here's
- an example:
-
- put DBSetSortOrder(ASCII) into DBResult
-
- To disable custom sort order processing, set the sort order to the empty
- string:
-
- put DBSetSortOrder("") into DBResult
-
- Inside of FileFlex is a C++ function called worldCompare(). When a
- DBCreateIndex or DBSeek command is executed, at some point the internal
- worldCompare routine is called upon to compare two strings. When
- worldCompare is called, it asks the host development environment (i.e.,
- Director) for the value of the reserved global variable gDBWorldSort. If
- worldCompare discovers that gDBWorldSort is not empty, it then asks the host
- environment for the contents of the global variable gDBSortOrder and uses
- that to control the comparison of two strings.
-
- Hint: One of the reasons building a sort order table is so complex and
- precise is you're building an actual binary data structure that FileFlex can
- use directly. While the table may be a bit painful to design once, this
- mechanism allows FileFlex to do custom comparisons and switch sort order
- tables at blinding speed.
-
- To turn off a sort order table, send the empty string to DBSetSortOrder.
- When this happens, the global gDBWorldSort is set to the empty string.
- FileFlex then knows to skip the extra processing inherent in comparing
- world-aware data strings.
-
- Cautions: The sort order impacts the internal compare functions; it does not
- reorder the dataset or the index. As a result, you should set your sort
- order BEFORE you call DBCreateIndex and you should always use the
- appropriate sort order table when doing a DBSeek or DBSelectIndex. Failure
- to do this could cause your data to appear out of order. When writing
- records, try not to get in the situation where two different sort orders
- need to be active when writing one record.
-
- Here's a sample script from the Sort Order demo file:
-
-     on mouseUp
-       global ASCIIReverse -- the reverse sort order table
-       -- initialize FF session
-       put DBOpenSession() into dbResult
-       if dbResult < 0 then
-         alert "FileFlex could not initialize!"
-         exit
-       end if
-       -- open a database file
-       put dbUse(field "theDBFile") into dbID
-       if dbID < 0 then errorClose "Could not open database file."
-       --
-       -- create a custom index on TITLE using ASCIIReverse
-       --
-       buildSortOrder_ASCIIReverse -- build the sort order
-       put DBSetSortOrder(ASCIIReverse) into dbResult
-       put "Creating index file..." into field "status"
-       updateStage
-       put dbCreateIndex("REVASCII","TITLE","0","0") into ndxID
-       if ndxID < 0 then errorClose "Could not create index file."
-       -- fill the list
-       put "Scanning data file..." into field "status"
-       updateStage
-       put DBSelectIndex(ndxID) into dbResult
-       if dbResult < 0 then errorClose "Could not select index file."
-       put "" into theList
-       put DBTop() into dbResult
-       repeat while 1 = 1 -- forever
-         if theList <> "" then put return after theList
-         put DBGetFieldByName("TITLE") into title
-         updateStage
-         put title after theList
-         if DBSkip(1) = 3 then exit repeat
-       end repeat
-       put theList into field "movie list"
-       updateStage
-       put DBSetSortOrder(EMPTY) into dbResult -- turn off
-       put DBCloseSession() into dbResult
-       if dbResult < 0 then
-         alert "FileFlex could not terminate!"
-         exit
-       end if
-       put "Processing complete..." into field "status"
-       updateStage
-     end
-
-     on errorClose s
-       alert s
-       put DBCloseSession() into dbResult
-       if dbResult < 0 then
-         alert "FileFlex could not terminate!"
-         abort
-       end if
-       abort
-     end errorClose
-
- ----------------------------------------------------------------------------
- Important: FileFlex uses the xBASE/dBASE III standard format. This format
- does not permit 8-bit deep characters in memo fields contained within DBT
- files. Translating to character values greater than 128 can therefore cause
- problems with this format. If you need to store non-ASCII text in memo
- fields, you should either use a custom translation table or store your
- data in text files and refer to those files from FileFlex fixed-length
- fields.
- ----------------------------------------------------------------------------
-
- Character Translation
-
- If you're using a language that has special characters in its character
- set (e.g., accent marks, umlauts, and other specialty characters), you may
- run into an interesting problem moving documents from Macintosh to Windows
- or vice-versa. That's because while ASCII is cleanly defined for the US
- English character set of "a-zA-Z", that does not mean that character values
- of special characters are uniformly used across platforms.
-
- FileFlex user Antonio Lucena of Madrid, Spain describes the conversion issue
- as it pertains to DOS vs. Windows files as well:
-
- "The problem is that Windows uses different character set than MS-DOS (and
- the databases created with dBASE). MS-DOS uses OEM Char set, and Windows
- uses ANSI. For example in OEM, a diacritical "e" is numbered 130, but in
- ANSI, same "e" is numbered 233. The same problem appears when you open a
- document (with diacritical vowels on it) made with the EDIT tool from MS-DOS
- and you try to open it with the WRITE tool from Windows and no previous
- conversion was made."
-
- Note: The above message illustrates the value of the free fileflex-talk
- mailing list. Another user had discovered the translation problem and by
- asking questions to this user and making that dialog public via
- fileflex-talk, Antonio was able to see the message and contribute his
- feedback. With feedback from him and others, we were able to identify the
- need for the new DBTranslateChars function described below.
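-
- To make Antonio's OEM vs. ANSI example concrete, here is a sketch in Python
- (not Lingo) that derives a 256-entry table in the layout used for the
- translation tables in this document. It assumes "OEM" means code page 437
- and "ANSI" means Windows code page 1252, which the message does not specify,
- and build_oem_to_ansi_table is a hypothetical helper name. Characters with no
- ANSI equivalent are left at their original value.

```python
def build_oem_to_ansi_table():
    """Build a 256-byte table mapping code page 437 values to code page 1252."""
    table = bytearray()
    for code in range(256):
        if code == 0:
            table.append(255)      # position 0 may not hold a 0
            continue
        char = bytes([code]).decode("cp437")
        try:
            new = char.encode("cp1252")[0]
        except UnicodeEncodeError:
            new = code             # no ANSI equivalent: keep the old value
        table.append(new)
    return bytes(table)


oem_to_ansi = build_oem_to_ansi_table()
print(oem_to_ansi[130])        # 233: the accented "e" from the message
print(oem_to_ansi[ord("A")])   # 65: plain ASCII is unchanged
```

- One caveat with this sketch: a target value of 255 cannot be told apart from
- the "no translation" marker, so a character whose ANSI value is 255 would
- need separate handling.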
-
- FileFlex' WorldFlex technology provides for character-level translation using
- much the same mechanism as used for developing sort order tables. You
- develop a translation table that describes the new and old values and pass
- it to FileFlex along with a container of characters to be translated.
-
- Setting up a character translation table is very straightforward. You need
- to build a Lingo string consisting of 256 characters. The position in the
- string is the value of the old character and the value at that position
- becomes the new character.
-
- Note: The first character in the string is considered "position 0" by
- FileFlex. Also note that you cannot place a 0 into any character position.
- If you do not want translation, place the corresponding character value into
- that position or the value 255.
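-
- The table lookup can be modelled like this. It is a sketch in Python of the
- rules just stated, not the actual DBTranslateChars code; translate_chars is a
- hypothetical name, and treating 255 as "leave this byte unchanged" is our
- reading of the note above.

```python
def translate_chars(data, table):
    """Translate each byte of data through a 256-byte translation table."""
    if len(table) != 256:
        raise ValueError("translation table must be exactly 256 bytes")
    out = bytearray()
    for value in data:
        new = table[value]
        # Assumption: 255 in the table means "leave this byte unchanged".
        out.append(value if new == 255 else new)
    return bytes(out)


# Identity table: position 0 holds 255, every other position holds itself.
identity = bytes([255] + list(range(1, 256)))

# Change a single entry so that "a" (97) becomes "b" (98).
a_to_b = bytearray(identity)
a_to_b[ord("a")] = ord("b")

print(translate_chars(b"banana", identity))       # b'banana'
print(translate_chars(b"banana", bytes(a_to_b)))  # b'bbnbnb'
```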
-
- Creating Character Translation Utility Scripts
-
- The best way to create the character translation table is to write a simple
- utility script. Here's an example script that builds an identity table, in
- which every ASCII character translates to itself:
-
-     on buildTranslateTable_ASCIIX
-       global ASCIIX
-       put "" into theTable
-       repeat with i = 0 to 255
-         if i = 0 then
-           put numToChar(255) after theTable -- use 255 in byte 0
-         else
-           put numToChar(i) after theTable -- position in table
-         end if
-       end repeat
-       put theTable into ASCIIX
-     end buildTranslateTable_ASCIIX
-
- Note the name of the handler is "BuildTranslateTable_ASCIIX". We've
- developed a convention where the routine that builds the translation table
- is called "BuildTranslateTable_" and the name of the translation itself is
- appended to the end. In order to prevent confusion with sort order tables,
- we've also placed an X after every translation table ("X" for an often used
- abbreviation for translate, which is "Xlate"). The translation table is
- placed in a global variable of the same name. So, for a translation table
- that converts to Windows diacriticals, we recommend naming the handler
- "BuildTranslateTable_WinCharX" and the global variable containing the
- translation table "WinCharX".
-
- Here's an example routine that converts upper case to lower case (and the
- reverse):
-
-     on buildTranslateTable_CaseReverseX
-       global CaseReverseX, ASCIIX
-       buildTranslateTable_ASCIIX
-       put ASCIIX into theTable
-       -- fill in lower case
-       repeat with i = 65 to 90
-         put numToChar(i+32) into char i+1 of theTable
-         -- using i+1 above because strings begin at 1, not 0
-       end repeat
-       -- fill in upper case
-       repeat with i = 97 to 122
-         put numToChar(i-32) into char i+1 of theTable
-       end repeat
-       put theTable into CaseReverseX
-     end buildTranslateTable_CaseReverseX
-
- The above routine reverses the case, so an upper case "A" becomes a lower
- case "a" and vice versa. To create a routine that always converts to upper
- case, make both sets of characters upper case. Likewise, to create a routine
- that always converts to lower case, make both sets of characters lower case.
- Here's an UpperX routine:
-
-     on buildTranslateTable_UpperX
-       global UpperX, ASCIIX
-       buildTranslateTable_ASCIIX
-       put ASCIIX into theTable
-       -- fill in upper case
-       repeat with i = 97 to 122
-         put numToChar(i-32) into char i+1 of theTable
-         -- using i+1 above because strings begin at 1, not 0
-       end repeat
-       put theTable into UpperX
-     end buildTranslateTable_UpperX
-
- ----------------------------------------------------------------------------
- WARNING: Make absolutely certain you fill in all 256 bytes. Otherwise,
- FileFlex could scan beyond the end of the translation table, producing
- unpredictable results and possibly terminating your program abnormally.
- ----------------------------------------------------------------------------
-
- Translating Characters Using FileFlex
-
- You can use FileFlex to translate character sets within a text container
- using the DBTranslateChars function. DBTranslateChars takes two parameters:
- the string to be translated and the pre-built translation table described
- above. It returns the translated string:
-
- put DBTranslateChars(myString,CaseReverseX) into newString
-
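- The table lookup described above can be modeled outside Director. The
- following Python sketch is an illustration only, not FileFlex's actual
- implementation. The names build_identity_table, build_case_reverse_table,
- and translate_chars are hypothetical, and the sketch assumes single-byte
- text and an identity table that stores 255 in position 0 (since a table
- cannot hold a 0).
-
```python
def build_identity_table() -> bytes:
    """256-byte table in which every position maps to itself.

    Position 0 holds 255 because a table may not contain a 0 byte.
    """
    return bytes([255] + list(range(1, 256)))


def build_case_reverse_table() -> bytes:
    """Identity table with the ASCII letters swapped between cases."""
    table = bytearray(build_identity_table())
    for code in range(65, 91):       # 'A'..'Z' become 'a'..'z'
        table[code] = code + 32
    for code in range(97, 123):      # 'a'..'z' become 'A'..'Z'
        table[code] = code - 32
    return bytes(table)


def translate_chars(data: bytes, table: bytes) -> bytes:
    """Replace each byte with the table entry at that byte's value."""
    if len(table) != 256:
        raise ValueError("translation table must contain exactly 256 bytes")
    return bytes(table[value] for value in data)
```
-
- In this model, translate_chars(b"Hello", build_case_reverse_table()) plays
- the role of DBTranslateChars(myString,CaseReverseX). Python byte strings are
- indexed from 0, so the i+1 adjustment needed in Lingo does not appear here.
-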
- Here's a sample routine that will do the character translation (it
- presupposes that FileFlex has been initialized properly with DBOpenSession):
-
- on mouseUp
-   global CaseReverseX
-
-   buildTranslateTable_CaseReverseX
-   put DBTranslateChars(field "text data",CaseReverseX) into field "text data"
- end mouseUp
-
- Case Translation
-
- If you're using a language that has special characters in its character
- set (such as accented letters, umlauts, and other specialty characters),
- you may run into an interesting problem converting between upper and lower
- case. With standard ASCII, it's easy to do a case conversion: just add 32 to
- an upper case letter's value or subtract 32 from a lower case letter's
- value. That's because in ASCII, the upper and lower case forms of a letter
- always differ by exactly 32. However, when dealing with international
- character sets where characters might have diacritical marks, it becomes
- much harder. That's because the characters have a wide variety of values
- and because there is little standardization.
-
- FileFlex's WorldFlex technology provides for intelligent case translation
- using much the same mechanism as is used for developing character
- translation tables. You develop a translation table that maps each old
- value to its new value and pass it to FileFlex along with a container of
- characters to be translated.
-
- You'll need to set up two case translation tables: one going to upper case
- and one going to lower case. For each table, you must build a Lingo string
- consisting of 256 characters. The position in the string is the value of the
- old character and the value at that position becomes the new character.
- 
- Note: The first character in the string is considered "position 0" by
- FileFlex. Also note that you cannot place a 0 into any character position.
- If you do not want a character translated, place either that character's
- own value or the value 255 into its position.
-
- Creating Case Translation Utility Scripts
-
- The best way to create the case translation table is to write a simple
- utility script. Here's an example script that simply converts ASCII lower
- case to ASCII upper case:
-
- on buildCaseTable_AsciiUC
-   global AsciiUC
-   put "" into theTable
-   -- Although it takes a few extra cycles, consider
-   -- building a full table first, then modifying it below.
-   -- This is much easier to understand and test.
-   repeat with i = 0 to 255
-     if i = 0 then
-       put numToChar(255) after theTable -- use 255 in byte 0
-     else
-       put numToChar(i) after theTable -- position in table
-     end if
-   end repeat
-   -- fill in upper case
-   repeat with i = 97 to 122
-     put numToChar(i-32) into char i+1 of theTable
-     -- using i+1 above because strings begin at 1, not 0
-   end repeat
-   put theTable into AsciiUC
- end buildCaseTable_AsciiUC
-
- Note the name of the handler is "BuildCaseTable_AsciiUC". We've developed a
- convention where the routine that builds the case table is called
- "BuildCaseTable_" with the name of the translation itself appended to the
- end. To prevent confusion with other tables, we've also ended the name of
- every upper case table with "UC" (use "LC" for tables that translate to
- lower case). The upper case table is placed in a global variable named
- after the translation, here "AsciiUC".
-
- Here's the routine that translates back down to lower case:
-
- on buildCaseTable_AsciiLC
-   global AsciiLC
-   put "" into theTable
-   -- Although it takes a few extra cycles, consider
-   -- building a full table first, then modifying it below.
-   -- This is much easier to understand and test.
-   repeat with i = 0 to 255
-     if i = 0 then
-       put numToChar(255) after theTable -- use 255 in byte 0
-     else
-       put numToChar(i) after theTable -- position in table
-     end if
-   end repeat
-   -- fill in lower case
-   repeat with i = 65 to 90
-     put numToChar(i+32) into char i+1 of theTable
-     -- using i+1 above because strings begin at 1, not 0
-   end repeat
-   put theTable into AsciiLC
- end buildCaseTable_AsciiLC
-
- ----------------------------------------------------------------------------
- WARNING: Make absolutely certain you fill in all 256 bytes. Otherwise,
- FileFlex could scan beyond the end of the translation table, producing
- unpredictable results and possibly terminating your program abnormally.
- ----------------------------------------------------------------------------
-
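- Because an incomplete table can have such serious consequences, it may be
- worth checking a table before handing it to FileFlex. The Python sketch
- below is one hedged way to model such a check. It is not part of FileFlex,
- the name find_table_problems is hypothetical, and the same two tests (256
- entries, no 0 value) could equally be written as a Lingo handler.
-
```python
def find_table_problems(table: bytes) -> list:
    """Return a list of problems found; an empty list means the table passed."""
    problems = []
    # Rule 1: the table must cover every byte value from 0 through 255.
    if len(table) != 256:
        problems.append(f"table has {len(table)} bytes, expected 256")
    # Rule 2: no position may contain a 0.
    zero_positions = [pos for pos, value in enumerate(table) if value == 0]
    if zero_positions:
        problems.append(f"zero byte at position(s) {zero_positions}")
    return problems


# Example: an ASCII upper case table laid out like the AsciiUC script above
# (255 in position 0, identity elsewhere, a-z mapped to A-Z).
ascii_uc = bytearray([255] + list(range(1, 256)))
for code in range(97, 123):
    ascii_uc[code] = code - 32
```
-
- Calling find_table_problems(bytes(ascii_uc)) returns an empty list, while a
- truncated table or one containing a 0 yields a description of the fault.
-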
- Intelligent Case Conversion Using FileFlex
-
- Case translation is used in a number of important ways within FileFlex, in
- particular within the intrinsic functions used in indexes and queries, and
- through special utility functions provided to perform simple case
- conversion.
-
- You can tell FileFlex to use a case translation table with the FileFlex
- command DBSetCaseTables. Unlike most FileFlex commands, DBSetCaseTables is a
- wrapper script that does not call FileFlex directly. Instead,
- DBSetCaseTables sets three FileFlex global properties: gDBWorldCase,
- gDBWorldUpper and gDBWorldLower.
-
- Here's the Lingo code for DBSetCaseTables:
-
- on DBSetCaseTables upperTable, lowerTable
-   global gDBWorldCase
-   global gDBWorldUpper, gDBWorldLower
-   if (upperTable = EMPTY or lowerTable = EMPTY) then
-     put EMPTY into gDBWorldCase
-   else
-     put "1" into gDBWorldCase
-     put upperTable into gDBWorldUpper
-     put lowerTable into gDBWorldLower
-   end if
-   return 0
- end DBSetCaseTables
-
- When you call DBSetCaseTables, pass it your two case tables. Here's an
- example:
- 
- put DBSetCaseTables(AsciiUC, AsciiLC) into DBResult
- 
- To disable custom case conversion processing, pass the empty string to
- DBSetCaseTables:
-
- put DBSetCaseTables("") into DBResult
-
- Inside of FileFlex is a C++ function called worldUpper(). When an intrinsic
- UPPER function is executed, the internal worldUpper routine is called upon
- to do the case conversion. When worldUpper is called, it asks the host
- development environment (i.e., Director) for the value of the reserved
- global variable gDBWorldCase. If worldUpper discovers that gDBWorldCase is
- not empty, it then asks the host environment for the contents of the global
- variables gDBWorldUpper and gDBWorldLower and uses them to control the
- conversion of the strings.
-
- To turn off custom case conversion, send the empty string to
- DBSetCaseTables. When this happens, the global gDBWorldCase is set to the
- empty string. FileFlex then knows to skip the extra processing inherent in
- case conversion of world-aware data strings.
-
- Cautions: Be careful that the first parameter is the upper case table and
- the second parameter is the lower case table. Also make sure you pass two
- tables. Failure to pass two complete case conversion tables could cause
- unpredictable results and might lead to abnormal termination.
-
- Standalone Intelligent Case Conversion Functions
-
- In addition to doing intelligent case conversions within index and query
- functions, FileFlex provides you with the ability to do intelligent case
- conversions of standalone strings.
-
- The function DBUpper will convert a string intelligently from lower case to
- upper case. If case tables have already been set with DBSetCaseTables,
- DBUpper will use those tables; otherwise, it will use the standard ASCII
- upper case conversion. Here's how to call DBUpper:
-
- put DBUpper(string) into newString
-
- Likewise, DBLower will convert a string intelligently from upper case to
- lower case. If case tables have already been set with DBSetCaseTables,
- DBLower will use those tables; otherwise, it will use the standard ASCII
- lower case conversion. Here's how to call DBLower:
- 
- put DBLower(string) into newString
-
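- The fallback behavior described above can be modeled in a few lines. This
- Python sketch is an illustration under stated assumptions, not FileFlex's
- code: the class CaseConverter and the helper build_tables are hypothetical,
- a value of None stands in for an empty gDBWorldCase, the "standard ASCII"
- fallback is assumed to touch only a-z and A-Z, and the sample codes 0x8A
- (a-umlaut) and 0x80 (A-umlaut) are assumed to follow the Mac OS Roman
- character set.
-
```python
class CaseConverter:
    """Hypothetical model of DBSetCaseTables, DBUpper, and DBLower."""

    def __init__(self):
        self.upper_table = None   # stands in for gDBWorldUpper
        self.lower_table = None   # stands in for gDBWorldLower

    def set_case_tables(self, upper_table=b"", lower_table=b""):
        # An empty table on either side switches custom conversion off,
        # mirroring the EMPTY test in the DBSetCaseTables handler.
        if not upper_table or not lower_table:
            self.upper_table = self.lower_table = None
        else:
            self.upper_table, self.lower_table = upper_table, lower_table
        return 0

    def upper(self, data: bytes) -> bytes:
        if self.upper_table is None:
            # Assumed fallback: plain ASCII conversion of a-z only.
            return bytes(b - 32 if 97 <= b <= 122 else b for b in data)
        return bytes(self.upper_table[b] for b in data)

    def lower(self, data: bytes) -> bytes:
        if self.lower_table is None:
            # Assumed fallback: plain ASCII conversion of A-Z only.
            return bytes(b + 32 if 65 <= b <= 90 else b for b in data)
        return bytes(self.lower_table[b] for b in data)


def build_tables():
    """Build sample upper and lower tables that also pair 0x8A with 0x80."""
    identity = [255] + list(range(1, 256))   # 255 in position 0, never a 0
    upper, lower = bytearray(identity), bytearray(identity)
    for code in range(97, 123):
        upper[code] = code - 32
    for code in range(65, 91):
        lower[code] = code + 32
    upper[0x8A] = 0x80   # assumed: a-umlaut to A-umlaut
    lower[0x80] = 0x8A   # assumed: A-umlaut to a-umlaut
    return bytes(upper), bytes(lower)
```
-
- With no tables set, upper() leaves the byte 0x8A untouched. After
- set_case_tables is called with the two sample tables, the same call maps it
- to 0x80, and passing an empty value restores the plain ASCII behavior.
-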